Abstract


Recent research has revealed new and unexpected applications of 
network control science within biomedicine, pharmacology, and medical 
therapeutics. These new insights and new applications generated in 
turn a rediscovery of some old, unresolved algorithmic problems. One of 
these problems is the Structural Target Control optimization problem, 
known in previous literature also as Structural Output Controllability 
problem, which is defined as follows. Given a directed network and a 
target subset of nodes, the task is to select a small (or the smallest) 
set of nodes from which the target can be independently controlled, 
i.e., there exists a set of paths from the selected set of nodes (called 
driver nodes) to the target nodes such that no two paths intersect 
at the same distance from their targets. Recently, the Structural Target
Control optimization problem has been shown to be NP-hard, and
several heuristic algorithms were introduced and analyzed, both on
randomly generated networks and on biomedical ones.
In this paper, we show that the Structural Target Controllability
problem is fixed parameter tractable when parameterized by the number
of target nodes. We also prove that the problem is hard to approximate
within a factor better than O(log n). Taking into consideration the real case
formulations of this problem we identify two more parameters which 
are naturally constrained by smaller bounds: the maximal length of a 
controlling path and the size of the set of nodes from which the control 
can start. With these new parameters we provide an approximation 
algorithm which is of exponential complexity in the size of the set of 
nodes from which the control can start and polynomial in all the other 
parameters. 

Keywords: Structural control, Network control, Optimization algo- 
rithm, Fixed parameter algorithm, NP-hardness, Linear networks. 


1 Introduction 


The network control research field has been investigated for more than 50
years, although some of its algorithmic questions have been solved only
recently. The general topic is concerned with the optimization of the outside
intervention needed in order to drive a linear, time-invariant, dynamical
system from an arbitrary initial state, to a precise final configuration, in 
finite time. Although many real-life dynamical systems tend not to be 
linear, most of these systems are known to be well approximated by such 
dynamics, or could behave as such in specific conditions, such as at their 
steady state. Inquiries into this field were initiated in the 60's and 70's,
see, e.g., [16, 13, 24]. However, it was only in 2011 that Liu et al. [17]
proved that the full network control optimization problem can be solved in
polynomial time via a reduction to the maximum matching problem in directed
graphs. The result received a lot of interest and sparked a renewal of the
field. Since then, network control theory and its newly discovered results
have been successfully applied to the study of control over power grid
networks [12], of biomedical signaling processes [14, 11, 27], and even the
control of social networks [15, 17].

Driven by this new insight into the field as well as by its new applications 
into the current world of Big (or just Large) Data, researchers have realized 
that full control can sometimes still be too expensive. For example, network 
control theory has been recently applied in the case of cancer-related bio- 
medical networks [14, 11], with the aim of using known drugs in order 
to drive the system towards a more favorable state. Thus, researchers 
aimed to use the protein signaling network in order to drive cancerous
cells towards apoptosis, i.e., programmed cell death. However, the full
controllability of sparse homogeneous networks, such as many bio-medical
networks (e.g., gene signaling networks, metabolic networks, gene regulating
networks, etc.) requires a lot of effort, sometimes needing direct outside
control over up to 70% of the initial nodes of the network [14, 17]. Since
in these cases outside control is equivalent to the use of specific drugs,
and since these protein networks contain up to 2-3 thousand nodes, a 70%
direct outside control would be infeasible. Thus, we have a
new controllability problem, a variant of the initial control-theory
problem, namely that of target control. Instead of enforcing the control of
the entire network, one alternative goal is to optimize the outside intervention 
needed to control only a well-specified target, i.e., a subset of the initial 
network. The aforementioned goal proves to be particularly well-fitted with 
the study of protein signaling networks, as recent research has emphasized 
the existence of disease-specific essential genes, i.e., disease-specific sets of 
genes/proteins which, if knocked down, would drive the corresponding cells to 
apoptosis [2, 28, 31]. As is often the case, new formulations lead to new
problems. The Structural Target Control (optimization) problem (STC) [10, 4]
asks for the minimum amount of outside intervention needed to drive a
linear dynamical system from any initial state to a desired final state of the
chosen targets.

Contrary to the full network control case, the Structural Target Con- 
trollability problem was proved to be NP-hard [4]. Several heuristic ap- 
proaches have been implemented and applied to the study of biomedical 
networks [10, 4, 14, 11]. However, approximation algorithms for this problem 
are not known. 

Assuming the widely believed conjecture that $P \neq NP$, no polynomial
time exact algorithm exists for any NP-hard problem. Thus, there are
several alternative methods to tackle the difficulty of these problems, such as
approximation algorithms and fixed parameter algorithms. Approximation 
algorithms run in polynomial time and provide a suboptimal solution. Nev- 
ertheless, unlike heuristic algorithms, approximation algorithms guarantee 
that on every input instance the solution they return is within a certain 
factor of the optimal solution. For example, a 2-approximation algorithm 
for a minimization problem guarantees that on every input the solution 
returned is at most twice the cost of the optimal solution on that input. 
However, some problems, such as the one studied in this paper, might not
have approximation algorithms with a constant approximation factor, unless
$P = NP$. See [26] for a textbook on approximation algorithms.

In practice, many problems have parameters that are typically much
smaller than the input size. We can exploit the existence of these parameters
in order to design faster algorithms for these problems. Parameterized
complexity [9, 5] aims to classify problems according to various parameters
that are independent of the size of the input. A fixed parameter algorithm
runs in time $f(k) \cdot O(n^c)$, where n is the input size, c is a constant,
f is a computable function, and k is the value of a parameter (independent
of the input size). A problem is termed fixed parameter tractable (FPT) if
it admits a fixed parameter algorithm.
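
As a rough numerical illustration of why the form $f(k) \cdot O(n^c)$ matters, consider two running times that are chosen here purely for comparison and are not taken from any algorithm in this paper: $2^k n^2$ (fixed parameter) and $n^k$ (exponent depends on the parameter).

```latex
% Hypothetical running times, for n = 1000 and k = 10
\begin{align*}
2^k \cdot n^2 &= 2^{10} \cdot 10^{6} \approx 1.02 \times 10^{9}, \\
n^k &= \left(10^{3}\right)^{10} = 10^{30}.
\end{align*}
```

In the first case the exponential growth is confined to the parameter, so the cost stays polynomial in n for every fixed k with the same exponent.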

In this paper we show that the Structural Target Controllability problem 
is fixed parameter tractable when parameterized by the number of target 
nodes. Also, if a second parameter is allowed, namely the maximal length 
of a controlling path (which is known in practice to have low values), the 
resulting fixed parameter algorithm has a considerably improved complexity. 
Moreover, we formally prove that the STC problem is hard to approximate 
within a factor better than O(log n). 

Taking into consideration the medical and pharmaceutical insights on 
how this problem is formulated in the biomedical setting, we identify yet 
another parameter which is bounded by a lower value. This parameter is the 
size of the set of nodes from the network which can potentially be influenced 
by outside interventions, i.e., from which we can select our controlling 
nodes. These nodes correspond to known proteins which are targets of actual
drugs, a.k.a. drug-targets. The resulting formulation of the problem, i.e., the
Driver Restricted STC, itself has a fixed parameter algorithm, which has
a considerably improved time complexity. However, even with the above
mentioned additional constraint, the problem is intractable for real-case 
networks consisting of 100+ nodes. Thus, we design an approximation 
algorithm, which is of exponential complexity only in the size of the set of 
nodes from which the control can start and low polynomial in all the other 
parameters of the problem, i.e., the total number of nodes of the network, 
the size of the target set, and the maximal length of a controlling path. 


2 Notation and Preliminaries 


A linear, time invariant dynamical system (LTIS) is a system 


$$\frac{dx(t)}{dt} = A x(t) \qquad (1)$$

where $x(t) = (x_1(t), \ldots, x_n(t))^T$ is the n-dimensional vector describing
the system's state at time t, and $A \in \mathbb{R}^{n \times n}$ is the
time-invariant state transition matrix. The elements in x are called the
variables of the system. We denote by X the set of these variables.

The external control over the system is performed through the action
of m external driver nodes, $u(t) = (u_1(t), \ldots, u_m(t))^T$. Their influence
over the n variables of the system is described by the time-invariant input
matrix $B \in \mathbb{R}^{n \times m}$; then the LTIS (1), now denoted as (A, B),
becomes:

$$\frac{dx(t)}{dt} = A x(t) + B u(t) \qquad (2)$$

Let $T \subseteq X$, $T = \{t_1, \ldots, t_k\}$ for some $k \le n$, be a subset of
particular interest of the variables X, a.k.a. the target set. We say that
the LTIS (A, B) is T-target controllable if for any initial state of the
variables in X and any desired final state of the target variables, there
exists a time-dependent input vector $u(t) = (u_1(t), \ldots, u_m(t))^T$ that
can drive the system in finite time from its initial state to a state in
which the target variables are in the desired final setup. We associate to
the k-target set T the characteristic matrix $C_T \in \{0,1\}^{k \times n}$ where
$C_T(i,j) = 1$ iff $i = j$ and $i, j \in T$ (otherwise, $C_T(i,j) = 0$). It is
known, see e.g. [10], that a system (A, B) is T-target controllable if and
only if

$$\mathrm{rank}\, OC(A, B, C_T) = |T| \qquad (3)$$

where the matrix
$OC(A, B, C_T) := [C_T B \mid C_T A B \mid C_T A^2 B \mid \ldots \mid C_T A^{n-1} B]$
is called the controllability matrix.
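
The rank condition (3) can be checked numerically for small systems. The following is a minimal sketch, not code from the paper: the helper names (`mat_mul`, `rank`, `target_controllability_matrix`, `is_target_controllable`) are hypothetical, variables are indexed from 0, and the selector matrix is built with one row per target, which is one common way of realizing $C_T$.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), (len(M[0]) if M else 0)
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def target_controllability_matrix(A, B, targets):
    """Build [C B | C A B | ... | C A^(n-1) B] for the given target indices."""
    n = len(A)
    # Assumption: row r of C selects the variable targets[r].
    C = [[1 if j == t else 0 for j in range(n)] for t in targets]
    blocks, AkB = [], B
    for _ in range(n):
        blocks.append(mat_mul(C, AkB))
        AkB = mat_mul(A, AkB)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(targets))]

def is_target_controllable(A, B, targets):
    """Condition (3): rank OC(A, B, C_T) == |T|."""
    return rank(target_controllability_matrix(A, B, targets)) == len(targets)

# Chain x0 -> x1 -> x2, with A[j][i] != 0 meaning x_i influences x_j.
A_chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
# Star x0 -> x1 and x0 -> x2.
A_star = [[0, 0, 0], [1, 0, 0], [1, 0, 0]]
B = [[1], [0], [0]]  # a single driver acting on x0
```

With these inputs, the chain is target controllable for every target set, while the star fails for the two leaves because both are reached from the driver at the same distance.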

In the particular case when the target is the entire n variable set X,
the above condition translates to the well known Kalman's condition for full
controllability [13], i.e., an LTIS (A, B) is (fully) controllable if and only if
$\mathrm{rank}[B \mid AB \mid A^2 B \mid \ldots \mid A^{n-1} B] = n$.

The notion of target controllability, and the focus on imposing a
controlling effect only on a subset of the variables of the system, has been
introduced and studied only recently, see e.g., [10, 4, 14, 11]. However, this
notion can be seen as a special case of output controllability, a topic which
received considerable attention in the 80's and 90's, see, e.g., the works of
Poljak and Murota [21, 22, 20].

Although the control methodology seems to be very dependent on the
input data, i.e., the transition matrix A, it turns out that this is not the case.
We say that an LTIS (A, B) is T-structurally target controllable (with respect
to a given size-k target set T) if there exists a time-dependent input vector
$u(t) = (u_1(t), \ldots, u_m(t))^T$ and matrices A and B with non-zero values, that
can drive the state of the target nodes to any desired output in finite time. A
deep result of [16, 24] shows that a system is structurally target controllable
if and only if it is target controllable for all structurally equivalent matrices A
and B, except a so-called “thin” set of matrices; we say that two matrices
are structurally equivalent iff they have the same dimensions and differ only
on their non-zero values.⁵ Thus, almost all matrices A and B are “a good
choice”. According to equation (3) above, for a k-sized target T, a system
(A, B) is structurally T-target controllable if and only if there exist values
for the non-zero entries in A, B such that $\mathrm{rank}\, OC(A, B, C_T) = |T| = k$.

It was shown in [4] that, from a practical perspective, it is more
meaningful to analyze the controllability optimization problem from the point
of view of minimizing the number of driven nodes. Thus, we focus on this
particular formulation of the optimization problem. Accordingly, we impose
that each driver node is connected to exactly one driven node, i.e., in the
matrix representation of the above network we require that the input matrix B
contains exactly one non-zero element on each column. We define the notion
of optimization for structural target controllability in the case of LTIS as
follows:


Definition 1 (The Structural Target Control (Optimization) problem
in case of LTIS)

Input: The size-n variable set X, the associated transition matrix A of size
$n \times n$, and a size-k target subset $T \subseteq X$, with $k \le n$.

Output: Matrix B of size $n \times m$ such that

(a) every column of B contains exactly one non-zero value,

(b) $\mathrm{Srank}\, OC(A, B, C_T) = k$, where $\mathrm{Srank}\, OC(A, B, C_T)$ is the
generic rank (or structural rank) of the structural matrix
$OC(A, B, C_T)$, i.e., the maximal value for the rank of
$OC(A, B, C_T)$ for matrices A, B, and $C_T$ that have non-zero
values on the non-empty entries of A, B, and $C_T$, respectively,

(c) m (i.e., the number of columns of B) is minimum among all
feasible matrices.
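
One standard way to estimate a generic rank such as $\mathrm{Srank}\, OC(A, B, C_T)$ is to substitute random non-zero values for the structural non-zeros and compute an ordinary rank. The sketch below follows this idea over a prime field; it is a swapped-in randomized technique for illustration, not a procedure described in this paper. The function names, the modulus `P`, and the input encoding (edge list plus list of driven nodes) are all assumptions. Each trial yields a lower bound on the generic rank, which is attained with high probability.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1 (an arbitrary choice)

def rank_mod_p(M, p=P):
    """Rank of an integer matrix over GF(p) by Gaussian elimination."""
    M = [row[:] for row in M]
    rows, cols = len(M), (len(M[0]) if M else 0)
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] % p), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, p)  # modular inverse (Python 3.8+)
        M[r] = [(v * inv) % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

def generic_rank_oc(edges, n, driven, targets, trials=3, seed=0):
    """Estimate Srank OC(A, B, C_T) for a graph on nodes 0..n-1.

    edges   : pairs (i, j) meaning x_i -> x_j, i.e. A[j][i] is non-zero
    driven  : nodes that each receive their own driver (one column of B each)
    targets : indices of the target variables
    """
    rng = random.Random(seed)
    best = 0
    for _ in range(trials):
        A = [[0] * n for _ in range(n)]
        for i, j in edges:
            A[j][i] = rng.randrange(1, P)
        cols = []
        for d in driven:
            v = [0] * n
            v[d] = rng.randrange(1, P)          # the single non-zero of this column of B
            for _ in range(n):                  # columns C A^0 b, ..., C A^(n-1) b
                cols.append([v[t] for t in targets])
                v = [sum(A[r][c] * v[c] for c in range(n)) % P for r in range(n)]
        M = [[col[r] for col in cols] for r in range(len(targets))]
        best = max(best, rank_mod_p(M))
    return best
```

For instance, a star with centre 0 and leaves 1 and 2 has generic rank 1 on the target set {1, 2} when only node 0 is driven; adding a self-loop on node 2 raises it to 2.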


⁵ It is beyond the goal of this paper to define the topological notion of
thin sets; we only give here the intuition that such sets consist of isolated
cases that may be easily replaced with nearby favorable cases.

It is known, see e.g. [21, 22], that the structural controllability problem
has a counterpart formulation in terms of graphs/networks. Given an LTIS
(A, B), we associate to it the graph $G_{(A,B)} = (V, E)$ where the n variables
of the system $\{x_1, \ldots, x_n\}$ and the size-m external controller
$\{u_1, \ldots, u_m\}$ are the nodes of the graph, while directed edges
correspond to the non-zero values in the state transition matrix and input
matrix, respectively. That is, there exists a directed edge from the node
corresponding to variable $x_i$ to the node corresponding to $x_j$ if and only
if $A(x_j, x_i) \neq 0$.⁶ Similarly, there exists a directed edge from $u_i$ to
$x_j$ if and only if $B(x_j, u_i) \neq 0$. The nodes $\{u_1, \ldots, u_m\}$ are
called driver nodes, while the nodes $x_j$ such that there exists i with
$B(x_j, u_i) \neq 0$ are called the driven nodes of the network. In the
literature, the driver and the driven nodes are sometimes known as input and
controlled nodes [10, 17]. Roughly speaking, the difference between driver
and driven nodes is as follows. The set of driver nodes describes the
complexity of an outside controller, assuming this controller can
independently interact with/influence several well specified nodes of the
network. Meanwhile, the set of driven nodes provides the exact collection of
network nodes that are used in order to ultimately control the entire set of
targets. From an algebraic perspective, the number of driver nodes is given
by the number of (nonzero) columns of the control matrix B, while the number
of driven nodes is given by the number of nonzero rows of B.
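
The correspondence between the pair (A, B) and its graph can be made concrete with a short sketch. The function name `build_control_graph` and the node labels `x1`, `u1`, etc. are hypothetical; the code only assumes the convention stated above, namely that a non-zero entry in row j and column i gives an edge from node i to node j.

```python
def build_control_graph(A, B):
    """Return (edges, driven) for the graph associated to the LTIS (A, B).

    A is n x n, B is n x m, both given as lists of rows.
    """
    n = len(A)
    m = len(B[0]) if B else 0
    edges = []
    # A[j][i] != 0 gives the edge x_{i+1} -> x_{j+1}.
    for j in range(n):
        for i in range(n):
            if A[j][i] != 0:
                edges.append((f"x{i + 1}", f"x{j + 1}"))
    # B[j][i] != 0 gives the edge u_{i+1} -> x_{j+1}.
    for j in range(n):
        for i in range(m):
            if B[j][i] != 0:
                edges.append((f"u{i + 1}", f"x{j + 1}"))
    # Driven nodes correspond to the non-zero rows of B.
    driven = [f"x{j + 1}" for j in range(n) if any(B[j][i] != 0 for i in range(m))]
    return edges, driven
```

In this encoding the number of driver nodes is the number of columns of B and the number of driven nodes is `len(driven)`.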

Given an LTIS (A, B) and its associated graph $G_{(A,B)} = (V, E)$, the n
variables of the system are (all) structurally controllable from the m-sized
input controller u (and control matrix B) if and only if we can select a
set of n directed paths from driver nodes as starting points (we denote
this set as I) to each of the network nodes, as ending points, such that
no two paths would intersect at the same distance d from their end points.
The above formulation is closely related to the concepts of linking and
dynamic graph as investigated in [22, 21]. In case of the target controllability
problem, for a given target set $T = \{t_1, t_2, \ldots, t_k\} \subseteq X$, the
above graph formulation is naturally adjusted as follows. We introduce k new
output nodes $C_T = \{c_1, c_2, \ldots, c_k\}$ (also denoted as C when clear from
the context) and edges $(t_i, c_i)$, for all $1 \le i \le k$. Note that the output
matrix $C_T$ describes exactly the above mapping. Now, the objective is to
find a path family containing k directed paths, connecting all the driver
nodes (as start-points) to the output nodes (as end-points), such that no two
paths intersect at the same distance d from their end-points. In contrast to
the case of full control, the graph condition is only necessary for target
control, but not sufficient [21]. However, as investigated in [4], only in
very restrictive cases the existence of such a path family does not translate
into the algebraic definition of structural control. Thus, for all practical
purposes, the algorithmic process of finding such a family of k directed
paths is equivalent to verifying that the system is structurally target
controllable.

⁶ We implicitly interchange the usage of $x_i$ and i for matrix indices.

We give now the formal definition of the Structural Target Control 
(Optimization) problem in terms of graph theory. 


Definition 2 (The Structural Target Control (Optimization) problem
in terms of graphs (in short STC))

The input consists of a directed graph G = (V, E) and a set of nodes
$T = \{t_1, t_2, \ldots, t_k\} \subseteq V$. The goal is to find a set of nodes
$S \subseteq V$ of minimum cardinality that controls T. A set $S \subseteq V$
controls T if there exist k paths $P_1, P_2, \ldots, P_k$, where $P_i$ starts
with a node in S and ends with $t_i$, and any two paths $P_i$ and $P_j$ do not
intersect at the same distance d from their endpoints.
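
The path condition of Definition 2 is easy to verify for a given candidate family of paths. The following is a minimal sketch under the assumption that paths are given as lists of node names; the function name `is_controlling_path_family` is hypothetical.

```python
def is_controlling_path_family(edges, drivers, targets, paths):
    """Check that `paths` witnesses that `drivers` controls `targets`.

    Each path must follow edges of the graph, start in `drivers`, and end in
    a distinct target; no node may be used by two paths at the same distance
    from the respective path ends.
    """
    edge_set = set(edges)
    # One path per target, each ending in a different target.
    if sorted(p[-1] for p in paths) != sorted(targets):
        return False
    used = set()  # pairs (node, distance to the end of its path)
    for path in paths:
        if path[0] not in drivers:
            return False
        if any((a, b) not in edge_set for a, b in zip(path, path[1:])):
            return False
        last = len(path) - 1
        for pos, node in enumerate(path):
            slot = (node, last - pos)
            if slot in used:
                return False
            used.add(slot)
    return True
```

On a chain a -> b -> c, the single node a controls both b and c, since the two paths use a and b at different distances from their ends. On a star a -> b, a -> c, the node a would be needed twice at distance 1, so the check fails.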


3 Fixed Parameter Algorithms 


In this section we prove that the STC problem is fixed parameter tractable, 
when parameterized by several variables of our problem. First, we show
that one parameter, namely the number of target nodes |T| = k, suffices to
obtain such a fixed parameter algorithm. On the other hand, considering
the practical instances that motivate this problem, namely the targeted 
control of human protein signaling networks in cancer, we identify several 
other variables of this problem which are known to have significantly lower 
values, i.e., one or even two orders of magnitude lower than the total 
number of input nodes. Thus, we design more efficient FPT algorithms 
for the structural target control optimization problem using several other 
parameters. 


3.1 A One-parameter STC Algorithm 


In this subsection we present the FPT algorithm parameterized only by
|T| = k, the size of the target set. Our algorithm uses as a subroutine an
algorithm for the Set Cover problem and, thus, we first define the Set Cover
problem.

Definition 3 (Set Cover) Given a universe of elements $U = \{u_1, \ldots, u_k\}$
and a family consisting of n subsets of U, $\mathcal{S} = \{S_1, \ldots, S_n\}$,
find the smallest sub-collection $\mathcal{S}' \subseteq \mathcal{S}$ such that
the union of all the sets in $\mathcal{S}'$ is U.


Informally, our algorithm carries out the following steps. Firstly, for
each node v in the input graph, we compute all possible subsets of T that v
can control. Since |T| = k, there can be at most $2^k$ such subsets for each
node v. Then, we enumerate over all possible subsets of $2^T$ (notice that there
are precisely $2^{2^k}$ such subsets). For each such subset $D \subseteq 2^T$, we
check if there exists a collection of |D| nodes, such that each node controls
precisely one set in D. If so, we solve exactly the set cover instance (D, T)
and store the solution if it is better than the previously found solutions
(i.e., needs fewer nodes than the previous solutions to control the target
nodes). Algorithm 1 describes our procedure in detail.


Algorithm 1 An FPT algorithm for the STC problem
Input: A directed graph G = (V, E) and a set of nodes $T \subseteq V$, |T| = k
Output: A set of nodes $S \subseteq V$ of minimum cardinality that controls T.

1. For every node $v \in V$, compute all possible sets of target nodes
$C_v \in 2^T$ that v can control at the same time.

2. $OPT := \infty$, $S := \emptyset$

3. For every $D = \{D_1, D_2, \ldots, D_\ell\} \subseteq 2^T$ such that there exist
nodes $v_1, v_2, \ldots, v_\ell$ such that
$C_{v_1} = D_1, C_{v_2} = D_2, \ldots, C_{v_\ell} = D_\ell$, do:

(a) Solve exactly the set cover problem on instance (D, T).

(b) Let $D' = \{D_{u_1}, D_{u_2}, \ldots, D_{u_q}\}$ be the sets in the optimal
set cover. If $q < OPT$, then: $OPT := q$ and $S := \{u_1, u_2, \ldots, u_q\}$.

return S


Before we show the correctness of Algorithm 1, we prove the following 
lemma. Informally, Lemma 1 allows us to perform step 3b) of Algorithm 1, 
that is to safely combine the sets controlled by two or more different nodes. 


Lemma 1 Assume that the sets $D_{u_1}, D_{u_2}, \ldots, D_{u_\ell} \subseteq V$ are
controlled by the nodes $u_1, u_2, \ldots, u_\ell \in V$, respectively. Then, the
set $S := \{u_1, u_2, \ldots, u_\ell\}$ controls
$D_{u_1} \cup D_{u_2} \cup \ldots \cup D_{u_\ell}$.

Proof: Observe that it is enough to prove the lemma for two subsets $D_{u_1}$
(simply denoted as $D_1$ in the following) and $D_{u_2}$ (denoted as $D_2$)
controlled by the nodes $u_1$ and $u_2$, respectively; the generalization
follows immediately.

Let A be the generic matrix associated to our graph, and $B_1$, $B_2$ be the
column vectors describing the action of input nodes $u_1$ and $u_2$ over the
network. Then, by Equation (3) above and Definition 1, there exist values for
the non-zero entries of A, $B_1$, and $B_2$, such that
$\mathrm{rank}\, OC(A, B_1, C_{D_1}) = \mathrm{rank}[C_{D_1} B_1 \mid C_{D_1} A B_1 \mid \ldots \mid C_{D_1} A^{n-1} B_1] = |D_1|$
and
$\mathrm{rank}\, OC(A, B_2, C_{D_2}) = \mathrm{rank}[C_{D_2} B_2 \mid C_{D_2} A B_2 \mid \ldots \mid C_{D_2} A^{n-1} B_2] = |D_2|$.

Let $M_1, M_2, \ldots, M_{|D_1|}$ and $N_1, N_2, \ldots, N_{|D_2|}$ denote some
linearly independent columns from $OC(A, B_1, C_{D_1})$ and
$OC(A, B_2, C_{D_2})$, respectively, such that
$\det(M_1 \mid M_2 \mid \ldots \mid M_{|D_1|}) \neq 0$ and
$\det(N_1 \mid N_2 \mid \ldots \mid N_{|D_2|}) \neq 0$.

Let $D = D_1 \cup D_2$, $B = [B_1 \mid B_2]$, and let us investigate the rank of
$OC(A, B, C_D)$:

$$|D| \ge \mathrm{rank}\, OC(A, B, C_D) = \mathrm{rank}[C_D B \mid C_D A B \mid C_D A^2 B \mid \ldots \mid C_D A^{n-1} B]$$
$$= \mathrm{rank}[C_D B_1 \mid C_D A B_1 \mid \ldots \mid C_D A^{n-1} B_1 \mid C_D B_2 \mid C_D A B_2 \mid \ldots \mid C_D A^{n-1} B_2]$$
$$\ge \mathrm{rank}[M'_1 \mid M'_2 \mid \ldots \mid M'_{|D_1|} \mid N'_1 \mid N'_2 \mid \ldots \mid N'_{|D_2|}] = |D|,$$

where the $M'$ and $N'$ columns are obtained by extending the M and N columns
to the entire domain D, and the last equality can be deduced for example by
performing the Gaussian elimination steps specific to matrices
$[M_1 \mid M_2 \mid \ldots \mid M_{|D_1|}]$ and
$[N_1 \mid N_2 \mid \ldots \mid N_{|D_2|}]$, respectively.

Thus, $\mathrm{rank}\, OC(A, B, C_D) = |D|$, which means that within the current
network, the set $\{u_1, u_2\}$ is controlling the nodes in $D = D_1 \cup D_2$.

The correctness of Algorithm 1 follows from Lemma 1. The next theorem 
analyzes the running time of the algorithm. 


Theorem 1 Given a graph G = (V, E), such that |V| = n, and a target
set $T \subseteq V$ with |T| = k, Algorithm 1 solves the STC problem in time
$O(f(k) p(n))$. Thus, the STC problem is fixed parameter tractable.


Proof: We present in more detail and analyze the running time of each 
step of Algorithm 1. 

Step 1. For each node $v \in V$, we compute and store as follows all
the sets of nodes in T that v can simultaneously control. Firstly, we show
how to decide if a node $v \in V$ covers a given subset of nodes
$T' \subseteq T$ in polynomial time in |V|. Given a set of nodes
$X \subseteq V$, let N(X) be the open neighborhood of X, that is
$N(X) = \{v \in V : \exists a \in X \text{ s.t. } (v, a) \in E\}$. We
define the graph $G_{v,T'} = (V', E')$, where:

1. Let $T_0 = T$ and $T_{i+1} = N(T_i)$, $\forall\, 0 \le i < n$. The node set
$V'$ of the graph $G_{v,T'}$ consists of all the sets $T_i$ plus two other
nodes $\{s, t\}$. Since the node set $V'$ may contain more copies of the same
nodes from V, we refer to a node $p \in V$ that is in the set $T_i$ as $p^i$.
Notice that a node p cannot appear twice in a set $T_i$.

2. In the edge set $E'$ of the graph $G_{v,T'}$ we add an edge
$(a^{i+1}, b^i)$ if $(a, b) \in E$. Moreover, we add an edge $(s, v^i)$, if
$v^i \in T_i$. Finally we add an edge $(a^0, t)$, $\forall a \in T'$.

The node v can control simultaneously the nodes in the set $T'$ if and only
if there exist $k'$ node-disjoint paths from s to t, where $k' = |T'|$. Observe
that the graph $G_{v,T'}$ was constructed such that any two node-disjoint paths
from s to t in $G_{v,T'}$ correspond to paths in G from v to a node in $T'$,
paths that do not intersect at the same distance from the nodes in $T'$. The
k node-disjoint paths problem between two nodes is solvable in time
$O(k(n + m))$ on a graph with n nodes and m edges [3]. Thus, since $G_{v,T'}$
has at most $n^2$ nodes and $n^3$ edges, finding k disjoint paths between s and
t takes time at most $kn^3$.
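
The construction used in Step 1 can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, the layering is started from the subset $T'$ itself (which only omits copies that cannot lie on a path to $T'$), and the node-disjoint paths are counted with a plain augmenting-path maximum flow on a node-split network rather than with the algorithm of [3].

```python
from collections import deque

def max_node_disjoint_paths(succ, s, t):
    """Count node-disjoint s-t paths: unit-capacity max flow with node splitting."""
    cap = {}

    def add(u, v):
        cap.setdefault(u, {})[v] = 1
        cap.setdefault(v, {}).setdefault(u, 0)  # residual (reverse) arc

    nodes = set(succ) | {w for vs in succ.values() for w in vs}
    for x in nodes:
        if x not in (s, t):
            add((x, "in"), (x, "out"))          # each inner node can be used once

    def tail(x):
        return x if x in (s, t) else (x, "out")

    def head(x):
        return x if x in (s, t) else (x, "in")

    for u, vs in succ.items():
        for w in vs:
            add(tail(u), head(w))
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:        # BFS for an augmenting path
            u = queue.popleft()
            for w, c in cap.get(u, {}).items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    queue.append(w)
        if t not in parent:
            return flow
        w = t
        while parent[w] is not None:            # push one unit along the path
            u = parent[w]
            cap[u][w] -= 1
            cap[w][u] += 1
            w = u
        flow += 1

def controls_subset(edges, v, subset, max_len):
    """Decide whether node v can simultaneously control all nodes in `subset`."""
    pred = {}
    for a, b in edges:
        pred.setdefault(b, set()).add(a)
    succ = {"s": set()}
    layer = set(subset)                          # layer 0
    for a in layer:
        succ.setdefault((a, 0), set()).add("t")
    for i in range(max_len + 1):
        if v in layer:
            succ["s"].add((v, i))                # v may start a path of length i
        nxt = set()
        if i < max_len:
            for b in layer:
                for a in pred.get(b, ()):
                    succ.setdefault((a, i + 1), set()).add((b, i))
                    nxt.add(a)
        layer = nxt                              # layer i + 1 = N(layer i)
    return max_node_disjoint_paths(succ, "s", "t") == len(set(subset))
```

Here a copy `(p, i)` stands for $p^i$, so two node-disjoint paths in the layered graph never use the same node of G at the same distance from their targets.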

Then, to complete Step 1 of Algorithm 1, we repeat the procedure
described above for every node $v \in V$ and any subset $T' \subseteq T$. Since
there are $2^k$ subsets of T, the total running time of Step 1 of Algorithm 1
is $O(k 2^k n^4)$.

Step 3. Any set $C_v$ has at most k elements and, thus, any set D has
at most $2^k$ elements. Moreover, to decide if we enter the loop in Step 3, for
every set of D we check if it is one of the sets $C_v$ computed at Step 1. Thus,
the complexity of Step 3 of Algorithm 1 is $O(n 4^k 2^{2^k})$ times the running
time of Steps 3a) and 3b), where the n comes from the running time required to
compare two sets of size n.

Notice that since the number of sets in the set cover instance is bounded
by $2^k$ and the number of elements is k, we can solve the set cover
in $O(2^k 2^k) = O(4^k)$, since one can solve Set Cover with set family F and
universe U in $O(|F| \cdot 2^{|U|})$ time.
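
An exact Set Cover routine with the $O(|F| \cdot 2^{|U|})$ bound quoted above can be obtained by dynamic programming over subsets of the universe. The sketch below is one such routine; the name `min_set_cover` and its return convention (indices into the family, or `None` if no cover exists) are assumptions made for illustration.

```python
def min_set_cover(universe, family):
    """Return indices of a minimum sub-family covering `universe`, or None.

    Dynamic programming over bitmasks of covered elements:
    O(|family| * 2^|universe|) time.
    """
    elems = sorted(universe)
    index = {e: i for i, e in enumerate(elems)}
    masks = [sum(1 << index[e] for e in set(s) if e in index) for s in family]
    full = (1 << len(elems)) - 1
    inf = float("inf")
    best = [inf] * (full + 1)      # best[m] = fewest sets whose union is exactly m
    choice = [None] * (full + 1)   # (previous mask, index of the set added)
    best[0] = 0
    for covered in range(full + 1):
        if best[covered] == inf:
            continue
        for j, m in enumerate(masks):
            nxt = covered | m      # nxt >= covered, so it is processed later
            if best[covered] + 1 < best[nxt]:
                best[nxt] = best[covered] + 1
                choice[nxt] = (covered, j)
    if best[full] == inf:
        return None
    picked, state = [], full
    while state:
        state, j = choice[state]
        picked.append(j)
    return sorted(picked)

# Hypothetical output of Step 1: one controllable target set per node.
controlled = {"a": {"t1", "t2"}, "b": {"t3"}, "c": {"t2", "t3"}}
nodes = list(controlled)
cover = min_set_cover({"t1", "t2", "t3"}, [controlled[v] for v in nodes])
drivers = [nodes[j] for j in cover]
```

By Lemma 1, the nodes whose sets form the cover jointly control the union of those sets, so `drivers` is a candidate solution for Step 3b).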

Thus, the overall running time of Algorithm 1 is
$O(k 2^k n^4 + n 4^{2k} 2^{2^k})$.


3.2. Towards Efficient FPT Algorithms Using Multiple Pa- 
rameters 


In recent years, the STC problem has received significant attention in
connection to its applicability in bio-medicine and pharmacology, see
e.g. [4, 14, 11]. In this setting, one is required to select a small number of
drugs which, by enabling cascading effects in the protein signaling network,
would drive a set of well established key target nodes/proteins to a particular 
configuration. In turn, this configuration of the target proteins, also known 
as essential proteins, is expected to correlate with a positive therapeutic 
effect over the patient. In this setting, the number of internal nodes of the 
graph G corresponds to the number of proteins within our network, usually 
in the order of 1000 to 3000. Also, the target T will be given by the set 
of disease-specific essential proteins present in the network, which in these 
cases was observed to be in the order of 100 to 200 proteins, i.e., one can
roughly assume a 1 to 10 ratio between the number of targets and the
total number of nodes. What is also specific to this setting is that the length
of a controlling path, from a driven node to a target, must also be relatively
small, i.e., smaller than 10 and preferably around 5. This requirement is due
to the fact that such paths translate to cascading effects in the signaling
network and, thus, the more intermediary elements a path contains, the less
reliable the entire process and the desired outcome become.

In the following, we present a fixed parameter tractable algorithm
for STC whose time complexity is exponential in the parameters k and p,
corresponding to the size of the target set T and the maximum length of
the controlling path from a driver to a target node, respectively, and low
polynomial in n, the total number of nodes in the network. The algorithm
generalizes a Greedy approach first reported in [10] and later analyzed and
improved in [4, 14, 11].


Theorem 2 Given a graph G = (V, E) and a target set $T \subseteq V$ with |T| = k
and |V| = n, Algorithm 2 solves the Target Controllability Problem in time
$O\left(kn \cdot \left(\frac{e(n+k)}{k}\right)^{kp}\right)$.


Algorithm 2 An FPT algorithm for the STC problem parameterized by k,
the size of the target set, and p, the maximum length of the controlling path
Input: A directed graph G = (V, E), a set of nodes $T \subseteq V$, |T| = k, and
an integer p.

Output: A set of nodes $U \subseteq V$ of minimum cardinality that controls T.

1. We create a new graph $G' = (V', E')$. For determining $V'$ we add to V
a number of k nodes (denoted $u_1, u_2, \ldots, u_k$) and for $E'$ we add to E
a number of k edges, such that the edge $(u_i \to t_i) \in E'$,
$\forall\, 1 \le i \le k$.

2. We set $S_{best} = T$ and $S = \emptyset$.

3. We apply the iterative algorithm Control (Algorithm 3) for
$(G' = (V', E'), i = 1, T_0 = T, p, S)$.

return $S_{best}$

Proof: In the following, we present in more detail and analyze the
running time of each step of Algorithm 2 and of its Control sub-function,
i.e., Algorithm 3.

The final controlling set, $S_{best}$, can be updated only after p nested
applications of the iterative Control algorithm. In each of these p nested
steps, we need to generate a bipartite graph, enumerate all possible maximal
matchings, and form the set S, which will then be fed into the next application
of the iterative function Control. While the construction of the bipartite
graph can be done in O(kn) time, enumerating all its maximal matchings
requires O(n) per maximal matching, see e.g. [25]. In the worst case scenario,
when we are dealing with a complete graph G, all of the intermediary
bipartite graphs $G_i$ will also be complete. Thus, in each case, the number
of edges will be bounded by $k \cdot (n + k)$ (since we have $|V'| = n + k$ nodes
on the left side, and $|T_i| \le k$ nodes on the right side) while the number of
maximal matchings will be upper bounded by $\binom{n+k}{k}$. Therefore, the
overall time complexity can be upper bounded by:

$$O\Bigg(\underbrace{kn + \binom{n+k}{k}\bigg(kn + \binom{n+k}{k}\Big(\cdots\Big)\bigg)}_{p \text{ times}}\Bigg)$$

i.e., $O\left(\binom{n+k}{k}^{p} \cdot kn\right)$. The $(\cdots)$ denote that we have
$kn + \binom{n+k}{k}$ nested. As
$\binom{n+k}{k} < \left(\frac{e(n+k)}{k}\right)^{k}$, we get that the running time
of the algorithm can be upper bounded by
$O\left(kn \cdot \left(\frac{e(n+k)}{k}\right)^{kp}\right)$.
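
For completeness, the bound on the binomial coefficient used in the last step follows from a standard estimate of the factorial. The derivation below is a textbook argument added for illustration; it uses only the symbols n, k, and p from the statement of Theorem 2.

```latex
\begin{align*}
e^{k} = \sum_{j \ge 0} \frac{k^{j}}{j!} > \frac{k^{k}}{k!}
  &\;\Longrightarrow\; k! > \left(\frac{k}{e}\right)^{k}, \\
\binom{n+k}{k} = \frac{(n+k)(n+k-1)\cdots(n+1)}{k!}
  &\le \frac{(n+k)^{k}}{k!} < \left(\frac{e(n+k)}{k}\right)^{k}, \\
\binom{n+k}{k}^{p} \cdot kn
  &< kn \cdot \left(\frac{e(n+k)}{k}\right)^{kp}.
\end{align*}
```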


Algorithm 3 The iterative function Control called in the main program

Input: A directed graph G = (V, E), an integer i - the current level in the
linking graph, a set of nodes $T_{i-1}$ - the current target in the i-th level
of the linking graph, an integer p - the maximum expansion of the linking
graph, and a set of nodes S - the current solution (incomplete if i < p).

Output: The set $T_i$ which is the target in the (i+1)-th level of the linking
graph and an update of S, the current solution for the driven set. If i = p, a
possible update of the $S_{best}$ solution.

1. We build a bipartite graph $G_i$ with the nodes in V on the left side
(denoted $T_i$), and the nodes in $T_{i-1}$ on the right side. We add to $G_i$
all of the edges in E that have the source node in $T_i$ and the destination
node in $T_{i-1}$.

2. We enumerate all maximal matchings in the graph $G_i$ between the
nodes in $T_i$ and the nodes in $T_{i-1}$.

3. For each maximal matching, do:

(a) We remove from $T_i$ all of the nodes left unmatched. We add all
unmatched nodes from $T_{i-1}$ to S, if these nodes are not already
there.

(b) (Optionally, to speed up the search, we check if $|S| > |S_{best}|$, and
if so we backtrack).

(c) If i = p, we add to S all of the nodes in $T_i$. If $|S| < |S_{best}|$, then
$S_{best} \leftarrow S$.

(d) If $i \neq p$, we repeat again the iterative algorithm for
$(G' = (V', E'), i + 1, T_i, S)$.

Another sensitive parameter which arises from the applicability of
this method in the medical setting comes from restricting the set of nodes
from which the control over the target can be initiated, i.e., the set of
potential driver nodes. This set corresponds to a medium-sized set of
proteins, called drug-targets, which are known to be directly affected (usually
down-proliferated) by the use of known drugs. By further specific filtering
of the types of drugs that the user wants to focus on, the size of this set
can be further modified. For example, in [14], the authors use the set of
U.S. Food and Drug Administration (FDA) approved drugs, which selects
a set of 1500 direct drug-target proteins out of a total of approx. 20 000
proteins⁷ (excluding post-translational modifications and other variants of
these, such as phosphorylation, acetylation, etc.). This set can be enlarged
or restricted by making further choices such as: considering also drugs in
clinical trials, experimental drugs, drugs used in oncology, etc. Overall, such
a set of potential driver nodes can range from 1/10 to 1/100 of the total set
of nodes, which is a substantial restricting parameter. However, limiting the
number of potential driver nodes to a subset S of the graph nodes slightly
modifies the type of problem we have to study. Indeed, by making this
assumption we can no longer guarantee that the entire desired target can be
controlled. Thus, the Structural Target Controllability Problem becomes a
min-max type of question and is defined below. We call this variant of the
problem the Driver Restricted STC (DRSTC) Problem.

⁷ This number comes from the approximate total of 20 000 genes encoded in the
human genome.


Definition 4 (Driver Restricted STC Problem (DRSTC)) What is
the minimum number of driver nodes, selected out of the subset S, which can
control a maximum number of nodes from T, the target set? Potentially, we
can also ask for the specific sets of selected driver nodes and the subsequently
controlled nodes.


In the following, we provide a fixed parameter tractable min-max
optimization algorithm for DRSTC, whose time complexity is exponential in
the parameters s and p, corresponding to the size of the potential driver
set S and the maximum length of the controlling path from a driver to a
target node, respectively. Note that for practical purposes, p is a rather
small integer, e.g., p < 10. The algorithm is a further tailored variant of our
initial fixed parameter tractable Algorithm 1.


Theorem 3 Given a graph $G = (V, E)$, a target set $T \subseteq V$ with $|T| = k$, a
set of nodes from which the control can be initiated $S \subseteq V$ with $|S| = s$, and
an integer $p$ corresponding to the maximal length of a controlling path from
a driver to a target node, Algorithm 4 solves the Driver Restricted Target
Controllability Problem in time $O(n k^{ps})$, where $n = |V|$.


Proof: In the following, we present in more detail, and analyze the
running time of, each step of Algorithm 4.

In Step 1, for each $d_i$, each set $T^j_{d_i} \subseteq T$, $j \le p$, can contain at most $k$
elements; computing these sets is done in $O(nk^p)$, where $n = |V|$ is the total
number of nodes. Since $T_{d_i} \subseteq T$, the maximum number of ways in which
we can select $T_{d_i}$ in Step 2 is $k(k-1)(k-2)\cdots(k-p+1) \le k^p$, for each
$d_i \in S$.

The most computationally expensive part of the algorithm is Step 3,
where we have to compute all possible unions of $|S|$ sets, where each set $i$ is
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Algorithm 4 An FPT algorithm for the Driver Restricted STC problem 
parameterized by k, the size of the target set, s, the size of the restricted 
driver nodes set, and p, the maximum length of the controlling path. 
Input: A directed graph $G = (V, E)$, a set of target nodes $T \subseteq V$, $|T| = k$,
a set of potential driver nodes $S \subseteq V$, $|S| = s$, and an integer $p$, the maximum
length of a controlling path from a driver to a target node.

Output: A set of nodes $U \subseteq S$ of minimum cardinality that controls a
maximum subset of $T$, i.e., the subset of $T$ controllable from $S$.


1. For each $d_i \in S$, $1 \le i \le s$, compute the sets $T^1_{d_i}, \ldots, T^p_{d_i}$, where
$T^j_{d_i} \subseteq T$, $j \le p$, contains all those target nodes $t \in T$ for which there
exists a directed path of length $j$ from $d_i$ to $t$, i.e., the nodes from $T$
which are controllable in exactly $j$ steps from $d_i$.


2. Compute all possible sets $T_{d_i}$ such that exactly one element from each
$T^j_{d_i} \subseteq T$, $j \le p$, is added to $T_{d_i}$.


3. Compute all possible unions $\mathcal{T}_S$ for each choice of the sets $T_{d_i}$, i.e.,
$\mathcal{T}_S = \{\bigcup_{d_i \in S'} T_{d_i} \mid S' \subseteq S \text{ and } T_{d_i} \text{ computed as above}\}$.


return minimal $S'$ such that there exist $T_{d_i}$, $d_i \in S'$, such that $\bigcup_{d_i \in S'} T_{d_i}$
is a maximal element of $\mathcal{T}_S$.


either one of the possible choices of $T_{d_i}$, or $\emptyset$ if in that configuration $d_i \notin S'$.
Thus, we have to assemble a total of $O((k^p)^s)$ sets.

In order to output the result we have to keep track of the maximal
elements of the above sets, as well as the underlying $S' \subseteq S$ which generate
them. Thus, the complexity of the algorithm is in $O(nk^{ps})$.
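To illustrate how fast this bound grows, the factor $(k^p)^s$ can be evaluated for concrete values. The values $k = 14$ and $s = 15$ are those quoted for Network 3 in the note regarding Algorithm 4; the value $p = 5$ is an assumption made here only for the sake of the example.

```latex
(k^p)^s = k^{ps} = 14^{5 \cdot 15} = 14^{75},
\qquad
\log_{10} 14^{75} = 75 \log_{10} 14 \approx 75 \times 1.1461 \approx 85.96,
\qquad
14^{75} \approx 9 \times 10^{85}.
```

Under these assumed values the number of candidate unions already has 86 decimal digits, before the factor $n$ is taken into account.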


We mention that at the end of Algorithm 4 we can also output the
elements of the set $\bigcup_{d_i \in S'} T_{d_i}$, which represents the subset of target nodes
controlled from $S'$.
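Step 1 of Algorithm 4 can be sketched as a level-by-level expansion from a driver node. The sketch below is an illustration only: the function name `targets_by_walk_length` is hypothetical, edges are `(source, destination)` pairs, and "path of length j" is interpreted as a directed walk of exactly j edges in which nodes may repeat, which is an assumption about the intended definition.

```python
def targets_by_walk_length(edges, driver, targets, p):
    """Map each j in 1..p to the targets reachable from `driver` in exactly j steps."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)

    targets = set(targets)
    frontier = {driver}  # nodes reachable in exactly 0 steps
    result = {}
    for j in range(1, p + 1):
        # Advance every frontier node along one edge: walks of length exactly j.
        frontier = {v for u in frontier for v in successors.get(u, ())}
        result[j] = frontier & targets
    return result
```

Each level costs at most one pass over the edges, so the p sets of one driver are obtained in time proportional to p times the number of edges.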


Note regarding Algorithm 4: Despite the major reduction of the algorithmic
complexity of the STC problem for the restricted case, even for moderate-sized
instances, e.g. Network 3 from Table 1, which has 67 nodes, 14 targets,
and 15 potential driver nodes, the algorithm does not end in a reasonable
amount of time, i.e. 24 hours on a powerful desktop computer. This
suggests that either such an exhaustive search algorithm needs to be
considerably improved, or the real-case instances of this
problem can only be tackled by approximation counterparts.

In Section 5 we introduce a new variation of Algorithm 4 which has a
considerably improved complexity, making it possible to analyze
even real-case instances. This efficiency
comes with the drawback that the algorithm is not guaranteed to always
return the optimal solution. Nevertheless, we show that our algorithm is
almost always optimal.


4 Hardness of Approximation 


In this section, we show that the Structural Target Controllability (optimization)
problem cannot be approximated within a factor of $(1-\epsilon)\ln k$, $\forall \epsilon > 0$,
unless $NP \subseteq DTIME(n^{\log\log k})$, where $k$ is the number of nodes in the
target set $T$. We prove this via an approximation preserving reduction from the
Set Cover problem (see Definition 3). Feige [8] showed that Set Cover is hard
to approximate within $(1-\epsilon)\ln k$, $\forall \epsilon > 0$, unless $NP \subseteq DTIME(k^{\log\log k})$,
where $k$ is the number of elements in the universe.


Theorem 4 Unless $NP \subseteq DTIME(k^{\log\log k})$, the STC problem cannot be
approximated within a factor of $(1-\epsilon)\ln k$, $\forall \epsilon > 0$.


Proof: Given an instance of the Set Cover problem, i.e., a set $U =
\{u_1, u_2, \ldots, u_k\}$ with $k$ elements and $n$ sets $S_1, S_2, \ldots, S_n \subseteq U$, we construct
the following instance of the STC problem.


1. Add a node $s_i \in V$ corresponding to each set $S_i$ in the Set Cover
instance.


2. Add a node $t_i \in V$ corresponding to each element $u_i$ in the set $U$.


3. For each $S_i$, add $q_i = |S_i|(|S_i|-1)/2$ auxiliary nodes in $V$. We term
these nodes $a^i_1, a^i_2, a^i_3, \ldots, a^i_{q_i}$.


4. The target set $T$ consists of all the nodes $t_i \in V$.


5. For each set $S_i$ of the Set Cover instance, we construct $|S_i|$ paths
of length $2, 3, 4, \ldots, |S_i|+1$ (counted in nodes) as follows. Let $S_i = \{u_1, u_2, \ldots, u_{|S_i|}\}$.
Then we construct the paths: $\{s_i, t_1\}, \{s_i, a^i_1, t_2\}, \{s_i, a^i_2, a^i_3, t_3\}, \ldots,
\{s_i, a^i_{q_i-|S_i|+2}, \ldots, a^i_{q_i-1}, a^i_{q_i}, t_{|S_i|}\}$.


We will now show that the Set Cover instance has a solution with $x$
sets if and only if the target set $T$ can be controlled with $x$ driver nodes.
Thus, the existence of an approximation algorithm with factor $(1-\epsilon)\ln k$, for some
$\epsilon > 0$, implies the existence of an approximation algorithm with the same
factor for the Set Cover problem, which implies $NP \subseteq DTIME(n^{\log\log k})$.

Given a Set Cover with $x$ sets $S_{i_1}, S_{i_2}, \ldots, S_{i_x}$, the driver nodes
$s_{i_1}, s_{i_2}, \ldots, s_{i_x}$ control all the target nodes since each $s_{i_j}$ controls precisely
the target nodes corresponding to the elements in $S_{i_j}$. This holds since each
path from the node $s_{i_j}$ to nodes in $T$ has a different length.

Conversely, given a set of $x$ driver nodes that control all the target
nodes, we reconstruct a valid Set Cover with $x$ sets, by choosing the sets
corresponding to the driver nodes. Thus, the theorem follows.
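The construction used in the proof can be sketched as follows. This is a minimal illustration, assuming that the elements of each set are ordered by sorting and that nodes are encoded as tuples; the function name `set_cover_to_stc` and the node encoding are hypothetical.

```python
def set_cover_to_stc(universe, sets):
    """Build the STC instance (nodes, edges, targets) from a Set Cover instance."""
    edges = []
    targets = [("t", u) for u in universe]   # one target node per element
    nodes = list(targets)
    for i, members in enumerate(sets):
        source = ("s", i)                    # one source node per set S_i
        nodes.append(source)
        aux_count = 0
        # The j-th element of S_i is reached through j auxiliary nodes, so all
        # paths leaving s_i have pairwise different lengths.
        for j, u in enumerate(sorted(members)):
            path = [source]
            for _ in range(j):
                aux_count += 1
                aux = ("a", i, aux_count)
                nodes.append(aux)
                path.append(aux)
            path.append(("t", u))
            edges.extend(zip(path, path[1:]))
    return nodes, edges, targets
```

A set with three elements contributes 0 + 1 + 2 = 3 auxiliary nodes, in agreement with $q_i = |S_i|(|S_i|-1)/2$.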


5 An Algebraic Approach for Solving the Driver 
Restricted Structural Target Controllability 
Problem 


In this section we present a probabilistic heuristic algorithm for the DRSTC
problem (Definition 4), which uses an algebraic approach. As explained
in Section 2, the STC problem has an innate algebraic representation
(see Definition 1). In the case of a restricting set $S \subseteq X$ of potential driver
nodes, the input matrix $B$ is itself restricted by selecting only those nodes
from $S$, i.e., for any $S' \subseteq S$, $S' = \{x_{i_1}, x_{i_2}, \ldots, x_{i_m}\}$, we define $B_{S'} \in \mathbb{R}^{n \times m}$
having non-zero values only on the $m$ positions $B_{S'}(i_j, j)$, $1 \le j \le m$. Thus,
given such a restricting set $S \subseteq X$ and a bound $p \le n$ on the maximal length
of a controlling path from a driver to a target node, the above algebraic
formulation becomes:


Compute a minimal subset $S' \subseteq S$, $|S'| = m$, such that
$\mathrm{Srank}\, OC_p(A, B_{S'}, C_T) = \mathrm{Srank}\, OC_p(A, B_S, C_T)$, where $OC_p(A, B_{S'}, C_T)$ is
the length $p$ controllability matrix $OC_p(A, B_{S'}, C_T) := [C_T B_{S'} \mid C_T A B_{S'} \mid
C_T A^2 B_{S'} \mid \cdots \mid C_T A^p B_{S'}]$.
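As a small worked example of this matrix, consider the path $x_1 \to x_2 \to x_3$ with driver set $S' = \{x_1\}$, target set $T = \{x_3\}$ and $p = 2$. The example assumes the convention that $A(i, j) \neq 0$ whenever there is an edge from $x_j$ to $x_i$; the symbols $a_{21}$, $a_{32}$ and $b_1$ are free non-zero parameters introduced only for this illustration.

```latex
A = \begin{pmatrix} 0 & 0 & 0 \\ a_{21} & 0 & 0 \\ 0 & a_{32} & 0 \end{pmatrix}, \quad
B_{S'} = \begin{pmatrix} b_1 \\ 0 \\ 0 \end{pmatrix}, \quad
C_T = \begin{pmatrix} 0 & 0 & 1 \end{pmatrix}

C_T B_{S'} = 0, \qquad C_T A B_{S'} = 0, \qquad C_T A^2 B_{S'} = a_{32}\, a_{21}\, b_1

OC_2(A, B_{S'}, C_T) = \begin{pmatrix} 0 & 0 & a_{32}\, a_{21}\, b_1 \end{pmatrix}
```

The generic rank is 1, so $x_3$ is controlled from $x_1$ within two steps. With $p = 1$ the last block is dropped, the matrix is zero, and the target is not controlled.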

As in the case of Algorithm 4, we are going to consider all the
subsets $S'$ of $S$, by eliminating elements from $S$ one by one. Then, we will
(approximately) compute the generic rank $\mathrm{Srank}\, OC_p(A, B_{S'}, C_T)$, and we
will compare it with the maximal choice $\mathrm{Srank}\, OC_p(A, B_S, C_T)$. The generic
rank of a matrix cannot be computed in polynomial time [7]. However, it is
known from early works on structural network controllability [21, 29] that
for any LTIS $(A, B, C_T)$, the set $K = \{(A, B, C_T) \mid \mathrm{rank}\, OC(A, B, C_T) =
\mathrm{Srank}\, OC(A, B, C_T)\}$ is open and dense with respect to the operator norm and,
more importantly, its complement is of measure zero. That is, the
set of values for the non-zero entries in the matrices $A$, $B$, and $C_T$ for which
$\mathrm{rank}\, OC_p(A, B, C_T) < \mathrm{Srank}\, OC_p(A, B, C_T)$ is a very sparse set. Thus, for
computing the generic rank $\mathrm{Srank}\, OC_p(A, B, C_T)$, it is enough to compute
$\mathrm{rank}\, OC_p(A, B, C_T)$ (e.g., using the Gaussian Elimination method) for
one or several random valuations of the non-zero values in these matrices.
Algorithm 5 is a min-max approximation algorithm for DRSTC, whose time
complexity is exponential in the parameter $s$, corresponding to the size of
the potential driver set $S$, times a polynomial in the parameters $p$, $n$ and $k$,
corresponding to the maximal length of the controlling path from a driver
to a target node, the total number of nodes, and the size of the target set,
respectively.
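The random-valuation idea can be sketched in plain Python as below. This is an illustration under stated assumptions, not the implementation from [23]: nodes are numbered 0 to n-1, an edge (u, v) from u to v is assumed to set the entry A[v][u], non-zero entries are drawn as random positive integers, and the rank is computed exactly over the rationals. All function names are hypothetical.

```python
import random
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def rank(M):
    """Rank by Gaussian elimination over the rationals (exact, no rounding)."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def estimated_generic_rank(n, edges, drivers, targets, p, Q=3, seed=0):
    """Maximum rank of OC_p over Q random valuations of the non-zero entries."""
    rng = random.Random(seed)
    best = 0
    for _ in range(Q):
        value = lambda: rng.randint(1, 10**6)   # random non-zero entry
        A = [[0] * n for _ in range(n)]
        for u, v in edges:                      # assumed convention: edge u -> v
            A[v][u] = value()
        B = [[0] * len(drivers) for _ in range(n)]
        for j, d in enumerate(drivers):
            B[d][j] = value()
        C = [[0] * n for _ in targets]
        for i, t in enumerate(targets):
            C[i][t] = value()
        block, blocks = B, []
        for _ in range(p + 1):                  # C B, C A B, ..., C A^p B
            blocks.append(mat_mul(C, block))
            block = mat_mul(A, block)
        oc = [sum((blk[i] for blk in blocks), []) for i in range(len(targets))]
        best = max(best, rank(oc))
    return best
```

On the path 0→1→2 with driver 0, both nodes 1 and 2 are reported as controlled for p = 2, whereas on the star 0→1, 0→2 only one of the two targets is, since both are at the same distance from the driver.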

By increasing the number of times the rank computation is repeated,
with different random valuations, the algorithm produces a solution closer
to the optimum. By design, the algorithm will never output
a subset $S'$ which does not actually control a maximal subset of $T$. Notice
that if we were able to compute the generic rank of a matrix exactly, then
Algorithm 5 would become an exact algorithm.

Depending on the level of approximation desired, we choose the constant
$Q > 1$, as explained in Note 3 below. From a practical perspective it is enough
to have $Q = 3$. In the next theorem, we show the correctness of Algorithm 5.


Theorem 5 Given a graph $G = (V, E)$, a target set $T \subseteq V$ with $|T| = k$,
a set of nodes from which the control can be initiated $S \subseteq V$ with $|S| = s$,
and an integer $p$ corresponding to the maximal length of a controlling path
from a driver to a target node, Algorithm 5 produces a feasible (but possibly
suboptimal) solution for the Driver Restricted Target Controllability Problem
in time $O(2^s \times n^5)$.


Proof: In the following, we present in more detail, and analyze the
running time of, each step of Algorithm 5.

Exploring all possible subsets of $S$ clearly has complexity $2^{|S|}$ (see Note 1
below for a discussion on how to speed up this process in practice).

The next step of the algorithm is to choose $Q$ random valuations for the
non-zero values of matrices $A$, $B_S$, and $C_T$: $(A^1, B^1_S, C^1_T), \ldots, (A^Q, B^Q_S, C^Q_T)$.
For all these valuations compute the rank of $OC_p(A, B_S, C_T)$ as:


$OC_p(A, B_S, C_T) := [C_T B_S \mid C_T A B_S \mid C_T A^2 B_S \mid \cdots \mid C_T A^p B_S] \in \mathbb{R}^{k \times (p+1)s}$,
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Algorithm 5 A probabilistic heuristic algorithm for the Driver Restricted 
STC problem parameterized by k, the size of the target set, s, the size of 
the restricted driver nodes set, and p, the maximal length of the controlling 
path. 

Input: A directed graph $G = (V, E)$, a set of target nodes $T \subseteq V$, $|T| = k$,
a set of potential driver nodes $S \subseteq V$, $|S| = s$, and an integer $p$, the maximal
length of a controlling path from a driver to a target node.

Output: A set of nodes $U \subseteq S$ of minimum cardinality that controls a
maximal subset of $T$, i.e., the subset of $T$ controllable from $S$.


• For each subset $S' \subseteq S$:


  - Compute $Q$ times the rank of $OC_p(A, B_{S'}, C_T)$ for a random
    assignment of the non-zero elements of matrices $A$, $B_{S'}$, and $C_T$

  - Let $\mathrm{Srank}'\, OC_p(A, B_{S'}, C_T)$ be the maximum of the ranks
    computed at the previous step


return minimal $S'$ such that $\mathrm{Srank}'\, OC_p(A, B_{S'}, C_T) =
\mathrm{Srank}'\, OC_p(A, B_S, C_T)$


and approximate $\mathrm{Srank}\, OC_p(A, B_S, C_T)$ as: $\mathrm{Srank}'\, OC_p(A, B_S, C_T) :=
\max\{\mathrm{rank}\, OC_p(A^1, B^1_S, C^1_T), \ldots, \mathrm{rank}\, OC_p(A^Q, B^Q_S, C^Q_T)\}$.

Computing $OC_p(A, B_S, C_T)$ for each of the valuations is performed
in $O(pn^3)$ time and, since $p$ is bounded by $n$, computing $OC_p(A, B_S, C_T)$
takes at most $O(n^4)$ time. Also, the rank calculation can be performed
using, e.g., the Gaussian Elimination method. While this method involves
$O(n^3)$ operations, the implementation of the method may create numbers
with exponentially many bits. Nevertheless, there is a variant of Gaussian
elimination, called the Bareiss algorithm [1], that avoids this exponential
growth of the intermediate entries and has a time complexity of $O(n^5)$.
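The fraction-free idea behind the Bareiss algorithm can be sketched for integer matrices as follows. This is a minimal illustration, assuming integer valuations of the non-zero entries; the function name `bareiss_rank` is hypothetical. Every division by the previous pivot is exact, so all intermediate entries stay integers (they are minors of the input matrix).

```python
def bareiss_rank(matrix):
    """Rank of an integer matrix by fraction-free (Bareiss-style) elimination."""
    m = [row[:] for row in matrix]
    rows = len(m)
    cols = len(m[0]) if m else 0
    rank = 0
    prev = 1  # pivot of the previous elimination step
    for c in range(cols):
        # Look for a non-zero pivot in column c, at or below row `rank`.
        pivot = next((r for r in range(rank, rows) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            for cc in range(c + 1, cols):
                # Cross-multiplication followed by an exact integer division.
                m[r][cc] = (m[r][cc] * m[rank][c] - m[r][c] * m[rank][cc]) // prev
            m[r][c] = 0
        prev = m[rank][c]
        rank += 1
        if rank == rows:
            break
    return rank
```

Because Python integers have arbitrary precision, the sketch returns exact ranks; the point of the fraction-free update is only to keep the size of the intermediate numbers under control.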

For each of the subsets $S' \subseteq S$ explored above, derive the
restricted valuations $B^i_{S'}$ out of the valuations for $B_S$. Then, as
above, compute $OC_p(A^i, B^i_{S'}, C^i_T)$ and its rank. If $\mathrm{rank}\, OC_p(A^i, B^i_{S'}, C^i_T) <
\mathrm{Srank}'\, OC_p(A, B_S, C_T)$, re-compute the rank for the next valuation. We
consider that $\mathrm{Srank}'\, OC_p(A, B_{S'}, C_T) = \mathrm{Srank}'\, OC_p(A, B_S, C_T)$ iff for at least
one of the above valuations we have an equality.


In order to output the result we have to keep track of the minimal
$S' \subseteq S$ for which $\mathrm{Srank}'\, OC_p(A, B_{S'}, C_T) = \mathrm{Srank}'\, OC_p(A, B_S, C_T)$. Also,
unless this was performed in the initial Gaussian Elimination method
implementation, we have to determine which of the rows in $OC_p(A, B_{S'}, C_T)$ are
linearly independent, i.e., which of the targets from $T$ are controlled from $S'$.
Thus, the time complexity of Algorithm 5 is $O(2^s n^5)$.


5.1 Notes on the Implementation of Algorithm 5 


Given the prohibitively high computational complexity of all previous al- 
gorithms, we only considered the implementation of Algorithm 5, which is 
itself a more real-life scenario-oriented version of Algorithm 4. Furthermore, 
while its computational complexity cannot be improved, we detail below a 
few algorithmic features which considerably boosted the efficiency of the 
implementation. 

Note 1: This note discusses a Branch and Bound improvement of
Algorithm 5. At the core of Algorithm 5 is a complete subset exploration
of $S$, the collection of potential driver nodes. Moreover, we are interested
in finding a minimal $S' \subseteq S$ which can control as much as the entire $S$.
Thus, in our exploration, it makes sense to explore these subsets only as
long as they are smaller than the best subset identified so far. In this way
the search tree is considerably pruned, significantly improving the run-time
of the algorithm.
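One simple way to realise this pruning is to visit the subsets by increasing size and stop at the first one that reaches the rank of the full set. The sketch below is an assumption-laden illustration and may differ from the exploration order used in the actual implementation: `rank_of` is a hypothetical callback standing for the estimate $\mathrm{Srank}'\, OC_p(A, B_{S'}, C_T)$ of a subset $S'$.

```python
from itertools import combinations


def smallest_full_rank_subset(sources, rank_of):
    """Return a smallest subset of `sources` whose rank equals that of the whole set."""
    sources = list(sources)
    full_rank = rank_of(tuple(sources))
    # Subsets are visited by increasing size, so the first hit has minimum size
    # and no larger subset is ever evaluated.
    for size in range(len(sources) + 1):
        for subset in combinations(sources, size):
            if rank_of(subset) == full_rank:
                return set(subset)
    return set(sources)
```

With a toy callback that counts the targets covered by a subset, e.g. d1 covering {1, 2}, d2 covering {2} and d3 covering {3}, the search stops at {d1, d3} without evaluating the full set again.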

Note 2: On a similar note, while the algorithm provides a complete subset
exploration, there are a number of subsets that can safely be discarded. For
example, in the most favorable scenario, each source node would control the
maximum of $p+1$ target nodes (itself, and $p$ others), so there must be at
least $\lceil \mathrm{Srank}'\, OC_p(A, B_S, C_T) / (p+1) \rceil$ source nodes in the solution. Similarly, in the
least favorable scenario, each target node is controlled by a different source
node, so there can be at most $\mathrm{Srank}'\, OC_p(A, B_S, C_T)$ source nodes in the
solution.
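The two bounds of Note 2 amount to a one-line computation each. In the sketch below, `controlled_targets` stands for the value $\mathrm{Srank}'\, OC_p(A, B_S, C_T)$; the function name is hypothetical.

```python
def solution_size_bounds(controlled_targets, p):
    """Return (lower, upper) bounds on the number of source nodes in a solution."""
    # Most favorable case: every source controls p + 1 targets (ceiling division).
    lower = -(-controlled_targets // (p + 1))
    # Least favorable case: every controlled target needs its own source.
    upper = controlled_targets
    return lower, upper
```

For 10 controlled targets and p = 5 the bounds are 2 and 10, so subsets with a single source node can be skipped without any rank computation.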

Note 3: In our implementation of calculating the generic rank of matrices 
OC(A, B,C) we have seldom encountered cases when two different random 
valuations of the structural matrices A,B, and C would generate a different 
rank. More importantly, it has never happened that a third random valuation 
would generate yet another rank value. Thus, in our implementation of 
Algorithm 5 we have used Q = 3. 

Note 4: As the algorithm is intended to be applied in the biological domain, 
where shorter path lengths are to be desired due to the quick dissipation 
of a drug’s effects over long signalling pathways, we used a maximum path 
length of p= 5. 
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5.2 Results of Algorithm 5 


In order to assess the efficiency and the utility of Algorithm 5 for the DRSTC 
problem, we have run it on several real-life networks of different domains. 
Firstly, we used the social networks documented in [30] (1) and [18] (2), 
where the nodes represent people and the edges the positive sentiment from 
one person to another. Then, we considered the electronic networks presented 
in [19] (1, 2, 3), where the nodes represent logic gates and the edges the 
connections between them. Finally, we generated several protein-protein 
interaction networks starting from the essential genes for breast cancer (1, 3) 
and pancreatic cancer (2) described in [14], using the OmniPath [6] database, 
and considering the interactions between the essential genes (1, 2), or the 
interactions between the essential genes with one intermediary gene (3). For 
each network, we randomly generated three sets of target nodes (with the 
sizes of 10, 20 and 30) from the set of nodes with at least one incoming edge. 
Similarly, for each network, we also randomly generated three sets of source 
nodes from the set of nodes with at least one outgoing edge. If not enough 
suitable nodes existed in a network to form a set of certain size, then the 
corresponding run of the algorithm was skipped. 

The results are presented in Table 1, and are available, together with the 
implementation and the data sets, at [23]. In Table 1, the last three columns 
are, CT: number of controlled target nodes; CS: number of controlling source 
nodes; TS: running time, in seconds. 

The runs that did not complete within two days have been omitted /
marked with *. As can be seen in Table 1, Algorithm 5 can be successfully
applied on real-life networks.

As expected, the parameters with the biggest influence on the running
time are the size of the set of source nodes and the number of nodes in the
network. The total running time of the algorithm mostly depends
on the total number of subsets (and, thus, on the size of the set of source
nodes), and it was significantly decreased by skipping the invalid subsets and
not taking into consideration the larger solutions once a smaller one was
found. The running time per subset, in turn, depends on the rank computations
(which depend on the number of nodes in the network and, to a
lesser extent, on the size of the sets of nodes).

Furthermore, this strategy helps the algorithm perform faster on
well-connected networks, where few source nodes can control the target set of
nodes and, thus, are quickly found. This can be observed in the case of the
largest protein-protein interaction network (which is the most well-connected
of the analyzed networks), on which the algorithm completed significantly
faster for all sets compared to the largest similarly-sized electronic circuit
(which is less-connected).


Network                               Nodes        Edges  Targets  Sources   CT           CS         TS
Social Interaction 1                     32           96       10       10   10            2       0.11
                                                               10       20   10            2        2.5
                                                               10       30   10            2     2863.6
                                                               20       10   20            4       0.23
                                                               20       20   20            4       2.39
                                                               20       30   20            4    2886.38
Social Interaction 2                     67          182       10       10   10            3       0.22
                                                               10       20   10            2       2.43
                                                               10       30   10            2    5766.48
                                                               20       10   20            4        0.3
                                                               20       20   20            4       2.95
                                                               30       10   30            6       0.53
                                                               30       20   29            5        3.4
Electronic Circuit 1                    122          189       10       10    3            3       0.28
                                                               10       20    8            3      73.62
                                                               10       30    9            5   29228.78
                                                               20       10   10            6        2.9
                                                               20       20   18            6     284.15
                                                               30       10   13            5       2.63
                                                               30       20   25            8     655.04
Electronic Circuit 2                    208          189       10       10    3            3       1.21
                                                               10       20    3            3      11.62
                                                               10       30    6            3    8764.32
                                                               20       10    5            5       5.46
                                                               20       20    5            4     120.75
                                                               30       10    7            7      12.76
                                                               30       20    3            3       6.08
Electronic Circuit 3            [illegible]  [illegible]       10       10    4            3      10.53
                                                               10       20    3            2      26.15
                                                               20       10   10            6     150.67
                                                               20       20    8            6    5482.52
                                                               30       10   12  [illegible]     158.38
                                                               30       20   12            6  106267.06
Protein-Protein Interaction 1            64           94       10       10    7            3       0.19
                                                               10       20    8            3       2.73
                                                               10       30   10            4    6870.62
                                                               20       10   12            4       0.32
                                                               20       20   18            8     387.45
                                                               30       10   18            4       0.31
                                                               30       20   28           11     843.31
Protein-Protein Interaction 2   [illegible]           64       10       10    7            3       0.18
                                                               10       20   10            4       6.07
                                                               10       30   10            4    3951.35
                                                               20       10    8            4       0.35
                                                               20       20   11            6     390.33
                                                               30       10   13            4       0.37
                                                               30       20   20           10     1041.5
Protein-Protein Interaction 3           433         1604       10       10   10            3       5.85
                                                               10       20   10            3      17.89
                                                               10       30   10            3    2921.13
                                                               20       10   18            4      11.68
                                                               20       20   18            4     108.76
                                                               30       10   29            7      42.74
                                                               30       20   28            6    1474.74


Table 1: Results of Algorithm 5.

The order of the nodes within the sets of source nodes also has a large
influence on the final running time. For example, if the last node in the
list is required for the best solution, then that solution can
only be found within the last half of the checked subsets (based on the order
in which the algorithm generates and iterates through them), while the algorithm
would still check all previous subsets, resulting in a significant increase of
the running time.

However, the final running time could be further improved by decreasing 
the number of times that the rank of the corresponding matrices is computed, 
further shortening the maximum path length, or by using a different order 
(and stopping condition) for analyzing the subsets. 


6 Conclusions and Future Work 


Network Science has been proven to be highly relevant within the current 
developments of medicine and personalized therapeutics. Within this field, 
structural network control is a powerful and efficient tool for steering the 
involved bio-medical systems towards desirable configurations. Thus, the 
algorithmic optimization problems studied in this manuscript are relevant for 
the computational bio-medicine community, as highly optimized solutions 
have a significant chance of translating into efficient therapeutics. Although 
the Structural Target Control (Optimization) problem has been proven to 
be NP-hard in its general case and cannot even be approximated within a
constant factor, and although it is a known fact that bio-medical networks 
are rather large, containing thousands of nodes and (tens of thousands of) 
interactions, in practice, several of the involved parameters can still be 
considerably bounded to significantly lower values. In this research, we 
took advantage of these insights in order to provide several optimization 
algorithms which remain of low polynomial complexity with regard to the
size of the network, and are exponential only in those chosen parameters. 
Out of these algorithms, one in particular has been shown to be tractable for 
real-case networks containing up to 200+ nodes. Moreover, on all non-trivial 
test-cases, this latter algorithm gave more detailed and complete output 
than the current state of the art software dealing with this problem. 
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